 text-to-image generation


Turbo Learning for CaptionBot and DrawingBot

Neural Information Processing Systems

We study in this paper the problems of both image captioning and text-to-image generation, and present a novel turbo learning approach to jointly training an image-to-text generator (a.k.a.



Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT

Neural Information Processing Systems

Lumina-T2X is a nascent family of Flow-based Large Diffusion Transformers (Flag-DiT) that establishes a unified framework for transforming noise into various modalities, such as images and videos, conditioned on text instructions.